---
layout: post
title: "Reconstructing pictures with machine learning [demonstration]"
date: "2016-02-09"
author: Alex Rogozhnikov
excerpt: Machine learning visualized with images
tags:
- notebook
---
In this post I demonstrate how different machine learning techniques work.

The idea is very simple: treat each pixel of a picture as a sample, where the features are its (x, y) coordinates and the target is its intensity. Train a regressor on a small fraction of the pixels, then predict the intensity of every pixel to reconstruct the whole picture.
Don't treat this demonstration as a 'comparison of approaches': this problem (reconstructing a picture) is very specific and has very little in common with typical ML datasets and problems. And of course, this approach is not meant to be used in practice to reconstruct pictures :)
I am using scikit-learn and making use of its API, which allows users to construct new models via meta-ensembling and pipelines.
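To make the setup concrete, here is a toy sketch of how such a pixel dataset is built (the same layout is used by `train_display` below; the tiny array is mine):

```python
import numpy

# a toy 2x3 grayscale "image": each value is a pixel intensity
toy = numpy.array([[  0.,  50., 100.],
                   [150., 200., 250.]])
height, width = toy.shape

flat = toy.reshape(-1)                 # one intensity per pixel
xs = numpy.arange(len(flat)) % width   # column (x) of each pixel
ys = numpy.arange(len(flat)) // width  # row (y) of each pixel
data = numpy.array([xs, ys]).T         # features: (x, y) coordinates
target = flat                          # target: intensity
# e.g. data[3] == [0, 1] and target[3] == 150.0 (column 0, row 1)
```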
```python
# !pip install image
from PIL import Image
%pylab inline
import numpy
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestRegressor, BaggingRegressor, GradientBoostingRegressor, AdaBoostRegressor
from sklearn.cross_validation import train_test_split
from sklearn.random_projection import GaussianRandomProjection
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression
from sklearn.kernel_approximation import RBFSampler
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor

# scikit-learn-compatible estimators and meta-estimators from the rep library
from rep.metaml import FoldingRegressor
from rep.estimators import XGBoostRegressor, TheanetsRegressor
```
I took a rather complex picture with many little details:
```python
!wget http://static.boredpanda.com/blog/wp-content/uploads/2014/08/cat-looking-at-you-black-and-white-photography-1.jpg -O image.jpg
# !wget http://orig05.deviantart.net/1d93/f/2009/084/5/2/new_york_black_and_white_by_morgadu.jpg -O image.jpg
```
```python
# average over the color channels to get a grayscale intensity matrix
image = numpy.asarray(Image.open('./image.jpg')).mean(axis=2)
plt.figure(figsize=[20, 10])
plt.imshow(image, cmap='gray')
```
`train_size` is the fraction of pixels used to train the regressor. By default, the algorithm sees only 2% of the pixels (for example, for a 1000 × 1500 picture that is 30,000 training samples) and has to predict the intensities of all the rest.
```python
def train_display(regressor, image, train_size=0.02):
    height, width = image.shape
    flat_image = image.reshape(-1)
    xs = numpy.arange(len(flat_image)) % width   # column (x) of each pixel
    ys = numpy.arange(len(flat_image)) // width  # row (y) of each pixel
    data = numpy.array([xs, ys]).T               # features: (x, y) coordinates
    target = flat_image                          # target: pixel intensity
    trainX, testX, trainY, testY = train_test_split(data, target, train_size=train_size, random_state=42)

    # fit deviations from the mean intensity, then add the mean back to predictions
    mean = trainY.mean()
    regressor.fit(trainX, trainY - mean)
    new_flat_picture = regressor.predict(data) + mean

    # show the original and the reconstruction side by side
    plt.figure(figsize=[20, 10])
    plt.subplot(121)
    plt.imshow(image, cmap='gray')
    plt.subplot(122)
    plt.imshow(new_flat_picture.reshape(height, width), cmap='gray')
```
Starting with linear regression; the result is not very surprising, since all the model can produce is a linear gradient of intensity.

```python
train_display(LinearRegression(), image)
```
Next, decision trees with different depth limits and minimal leaf sizes:

```python
train_display(DecisionTreeRegressor(max_depth=10), image)
train_display(DecisionTreeRegressor(max_depth=20), image)
train_display(DecisionTreeRegressor(min_samples_leaf=40), image)
train_display(DecisionTreeRegressor(min_samples_leaf=5), image)
```
A random forest, which averages the predictions of many trees:

```python
train_display(RandomForestRegressor(n_estimators=100), image)
```
Nearest-neighbours regression:

```python
train_display(KNeighborsRegressor(n_neighbors=2), image)
```
To make the predictions smoother, we can use more neighbours and weight them by distance:

```python
train_display(KNeighborsRegressor(n_neighbors=5, weights='distance'), image)
train_display(KNeighborsRegressor(n_neighbors=25, weights='distance'), image)
```
The same with a different distance (the Canberra metric):

```python
train_display(KNeighborsRegressor(n_neighbors=2, metric='canberra'), image)
```
Gradient boosting over decision trees (XGBoost):

```python
train_display(XGBoostRegressor(max_depth=5, n_estimators=100, subsample=0.5, nthreads=4), image)
train_display(XGBoostRegressor(max_depth=12, n_estimators=100, subsample=0.5, nthreads=4, eta=0.1), image)
```
Neural networks provide smooth predictions and are not able to deal with the tiny sharp details of a picture.
```python
train_display(TheanetsRegressor(layers=[20, 20], hidden_activation='tanh',
                                trainers=[{'algo': 'adadelta', 'learning_rate': 0.01}]), image)

train_display(TheanetsRegressor(layers=[40, 40, 40, 40], hidden_activation='tanh',
                                trainers=[{'algo': 'adadelta', 'learning_rate': 0.01}]), image)
```
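The imports above also include `StandardScaler`, `RBFSampler` and `SVR`, which fit the same scheme. A sketch of a kernel-approximation pipeline (the parameters here are my arbitrary guesses, not tuned values):

```python
# a sketch: scale the coordinates, map them to random RBF features,
# then fit a support vector regressor on top; gamma and n_components are guesses
rbf_model = make_pipeline(StandardScaler(),
                          RBFSampler(gamma=0.5, n_components=100),
                          SVR())
train_display(rbf_model, image)
```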
Thanks to the scikit-learn API, we can also boost over pipelines, here AdaBoost over decision trees built on random projections of the coordinates:

```python
base = make_pipeline(GaussianRandomProjection(n_components=10),
                     DecisionTreeRegressor(max_depth=10, max_features=5))
train_display(AdaBoostRegressor(base, n_estimators=50, learning_rate=0.05), image)
```
This combination (bagging over trees built on random projections) is sometimes referred to as a Random Forest too, since the idea was proposed by Leo Breiman in the same paper.
```python
base = make_pipeline(GaussianRandomProjection(n_components=15),
                     DecisionTreeRegressor(max_depth=12, max_features=5))
train_display(BaggingRegressor(base, n_estimators=100), image)
```
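`FoldingRegressor` from rep is imported as well: it splits the training data into folds and trains a separate copy of the base model on each, so every sample is predicted by a model that did not train on it. A minimal sketch of plugging it in (assuming the constructor takes the base estimator and `n_folds`; the base model here is my arbitrary choice):

```python
# sketch: assuming FoldingRegressor(base_estimator, n_folds=...) from rep.metaml;
# the base estimator and its parameters are arbitrary choices
folding = FoldingRegressor(GradientBoostingRegressor(n_estimators=100, max_depth=5), n_folds=2)
train_display(folding, image)
```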
Feel free to download the notebook from the repository and play with other images / parameters.

This post was written in IPython. You can download the notebook from the repository.